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For a computational flow simulation tool to be useful in a design environment^ 
efficient To d^elop such a tool for incompressible flow applications, a number of d^erent implictt schmes are 

methods bv at least a factor of 2. 


Introduction 

A LTHOUGH computational fluid dynamics (CFD) now has a 
lot to offer an engineer, there is still room for significant im- 
provement. The efficiency and robustness of a flow solver are two 
of its most important features if it is to be successfully applied as a 
tool in the design process. This is especially true of Navier-Stokes 
methods, which require very fine resolution grids, particularly for 
hieh-Rey nolds-number flows. When engineers need to study a num- 
ber of design parameters, they often need to obtain hundreds o 
steady-state solutions to a particular problem. Use of optimization 
methods in design may require numerous flowfield solutions. In 
both these instances, a flow solver needs to produce solutions with- 
out requiring the users to tune the input numerical parameters for 

each computation that they run. 

The goal of the current study is to determine an efficient algorithm 
for obtaining steady-state solutions to the incompressible Navier- 
Stokes solutions for two-dimensional flow problems. There is a wide 
range of applicability of such a flow solver in engineering flow 
analysis, including high-lift multi-element airfoil computations and 
propulsion flow analysis. 2 These investigations are being performed 
for two-dimensional flows because these are cheaper allowing the 
investigation of many parameters, schemes, and flow problems— 
and because a two-dimensional analysis tool can be quite valuable 
in its own right. This study will use the 1NS2D flow solver, • and 
later the results can be utilized to improve the INS3D flow solver. 

There are many different types of solution techniques for steady- 
state incompressible Navier-Stokes computations too numerous 
to discuss in detail here. The INS2D and 1NS3D flow solvers by 
default use an implicit Gauss-Seidel line-relaxation (LR) scheme. 
This scheme has performed quite well for a large number of differ- 
ent flow problems 1-5 but has been shown to have some convergence 
problems for very fine meshes with multiple zones. For example, 
some fine-resolution, multizone grids used in recent multi-element 
airfoil calculations 1 have taken several thousand iterations to con- 
verge. whereas most cases that are run with this flow solver converge 
within 200 iterations. 
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Recent work in iterative, Krylov-space matrix solvers^ 7 has 
shown that some of these methods are applicable to CFD flow 
solvers. In particular, the generalized minimal residual (^MRES) 
method 8 is well suited to the matrix problems arising in implicit CFD 
solvers There are two distinct ways to implement GMRES in a flow 
solver. One approach is to use GMRES to solve the lineanzed system 
of equations resulting from the application of a time-marching type 
of scheme, such as an implicit Euler scheme. The other formulation 
is to use a nonlinear extension of GMRES to directly solve the dis- 
Crete form of the steady-state equations. The first approach has been 
attempted by many authors; see Refs. 9-1 1 for recent examples. The 
second approach was introduced by Wigton et al. 

The most important aspect of implementing GMRES is the pre- 
conditioning of the system of equations. For implementation in a 
CFD code, a good preconditioner is necessary for GMRES to con- 
verge. The current work investigates the use of GMRES to sol ve the 
linearized system of equations in delta form in the INS2D code. The 
code for the GMRES solver was obtained from the template sott- 
ware available as a companion to Ref. 7. The GMRES method can be 
easily implemented so that any existing solution process can be uti- 
lized as a preconditioner. One commonly used preconditioner for the 
GMRES method is an incomplete lower-upper (ILU) factorization 
with zero additional fill. See Meijerink and van der Vorst for a dis- 
cussion of ILU methods. The ILU solution scheme was added as an 
option to the INS2D code, along with a Point-Jacobi relaxation (PR) 
scheme, to go along with the Gauss-Seidel line-relaxation scheme 
already in the code. Presented in this paper are comparisons between 
six different solution processes: the three aforementioned algorithms 
run independently, and the GMRES method using these three algo- 
rithms as preconditioned, denoted as GMRES -1- ILU, GMRES+PR, 
and GMRES+LR. 

In the following sections, the features of the INS2D flow so ver 
are discussed, followed by a presentation of each of the implicit 
methods used in this study. The next section presents the computed 
results of each of these methods for three different flow problems: 
laminar flow over a backward-facing step, turbulent flow over a 
NACA 4412 airfoil, and turbulent flow over a three-element airfoil. 

Flow Solver 

The INS2D flow solver 3 - 4 solves the Reynolds-averaged incom- 
pressible Navier-Stokes equations using the method of artificial 
compressibility, 14 which adds a pseudotime derivative of pressure 
p to the continuity equation, resulting in 


?£ = -pv ■ 
Or 


( 1 ) 
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where t is the pseudotime. V is the velocity vector, and B is the arti 
ficral compressibility constant. This relaxes the elliptic nature of the 
equations and results in a hyperbolic-parabolic syst m ^e so er 

Iblem .S" 1 ? b ° th Stead ^ State a " d ^e-dependent flow 
problems, although the current study is concerned only with steadv- 

state solutions. The INS2D code is a finite difference, structured- 

fit, fl S ° 11 ' S capable of handling multiple-zone grids using 
either a patched multiblock (pointwise continuous) interface or an 

atThd h Ch,m fh a ' n '! rfaCe between z °nes. The boundary conditions 
fmni .-, h f v b ° Undanes and at zonal boundaries are Applied in an 
imp. c,t fashion during the solution process. A third-order upwind 
differencing scheme based on the method of Roe 15 is used to dis- 
cretize the convective terms, and the viscous terms are differenced 
8 S f ec . 0nd_0rd ® r central differences. The system of equations is 

Th^resuhi'n/d 6 ,' 111 ’ 6 USin§ "" imp]idt Eu,er time discretization. 
The resulting discrete system of equations has the form 
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At 




( 2 ) 


, W Sn ? ' S thC VeCt ° r ° f de P er| dent variables, (pressure, u- and 
u-vefocity components). At is the time-step size in pseudotime 

DosedTf'n 1 " IS , r C lteradon nurr >ber, and R is the residual, com- 
posed of the discrete form of the convective and viscous terms This 
system is linearized about pseudotime-level n, resulting in 


( 1 dR" \ 

(m + Jq) a Q = ~^ 


(3) 


where A Q Q Q" This equation is iterated until a steady- 
state solution is obtained at which time R" 0. In the current 
unp ei mentation the Jacobian of R on the left-hand side (LHsTo 
Eq. (3) is formed using a residual based on first-order differencing 
of the convective terms, whereas third-order differencing is used on 

iuxdfff ^ ( ™ S) In addition - proximate KS of 

cause e?a« n jaLh° m theu P win d-differencing scheme are used, be- 

o a rensorrst p fL°r thCSe tCrmS Would require the f° rm ation 

that he iHHp? ' 6 ° r m ° re details) ’ and jt has been assumed 
diat the added computational costs would outweigh any benefit 

LHS mrt 0 V S iS US f d t0 reduce the bandwidth of the resulting 
LHS matrix, resulting in lower memory and computational require- 
ments for the solution of Eq. (3). However, this use of approxlate 
acobians can also slow the convergence to a steady state, 
or turbulent flow calculations, the current study uses the turbu- 

udon off ' ° f , Baldwin and Barth ' 7 This model requires the so- 
T i l , n6 e convective-diffusive partial-differential equation 
This equation is uncoupled from the mean-flow equations during 
he solution process. The convective terms in the turbulence model 
are discretized using a first-order upwind-differencing scheme The 

SB? 5 (TexT ati r f ° r thC tUrbU,enCe -deltas the same 6 
orm as Eq. (3), except Q now represents a single variable at each 

gnd point instead of three variables; the LHS of Eq. 3 is a banded 
matrix composed of five diagonals, each containing scalar entries 
In all cases presented here, the turbulence model equation is solved 
using the same implicit scheme as the mean-flow equations. 

Implicit Schemes 

of Fn B rt? ) W3S SOlVed exactly at each time ste P- and if 'be LHS 
of Eq. (3) was composed of the exact Jacobians, and an infinite 

time step was used, this would be a Newton iteration In this ca 1 
quadratic convergence could be obtained if the Q” was close to 
he exact solution. Since the LHS is composed of an approSma e 

ST It 5 HS ' >'"» model i/LtS 

not^ ™ flow equatl0ns ’ ‘be current solution procedure is 

so u inn , T ' t? 1 ' 0 " ThuS the S° al is 10 ^tain an approximate 
solution to Eq. (3) m an efficient manner. Several methods are used 

S a r? 3 P : t0 do Ws ™ s terete form of the matrix from the LHS 
the^r ? 3 P entadla gonal banded matrix, where each entry on 
the diagonal consists of 3 x 3 blocks. Equation ( 3 ) can be written as 

[D -° 0, A, B, C. 0 0,E]AQ = -R" (4) 

tatlon of a n of f ^ ^ b,OCk diagona,s - In the implemen- 

of all of the implicit schemes, the code first computes and 


prS're^sT 5 ^ ^ (4) 3nd ,hen P roceeds wi ‘h the solution 
procedure This storage requires 48/V words, where N is the total 

number of gnd points. In the following, a subscript i refers to a 
single grid-point index when one can consider all of the grid points 
n a smgie vector of length N. The subscripts j and k are indE 
the dai com P u ‘ a '* ona *" s P ace directions ? and 17 , respectively When 

ILU Factorization 

In the ILU formulation, the matrix on the LHS of Ea (41 is 
replaced with the following factors: ^ * 

1°-° C, 0 0, E] 


B ' = A, - A,C’_ ] - D,E\, 

c; = [fi,r‘c, 


jmjx 


e; = [B;r<E, 

Multiplying these factors together, one sees that a matrix of the same 
structure as the original LHS is obtained, except that there are addi- 

and a v!? nals u 0fn ° nzer0entnes created Just above the D diagonal 
and just below the E diagonal. These new entries are ignored fn the 

ILu7oTsee 0 ReM2f' S dTi" “ ILU W ’ th Zer0 addi ‘ ional fil '- or 
additional fill °" lmp ' ementlng ™ schemes with 

namdv I ^^' Ver S ° me significam initialization work, 

mely the computation and storage of the B’ diagonal This re 

flli^kd Mtra St ° rage locations - w hen used as a preconditioner 
this done once at the beginning of the GMRES solution process 
and used reputedly during the GMRES iteration cycle This new 
set of factors gives an easy system to solve. The solution process 

Point Relaxation 

formed k R alg , 0rit 1 hm iterative| y solves a block diagonal system 
y multiplying all of the nonmain block diagonals by the 
urrent estimate for A Q and moving this to the RHS This is done 

sequentia],y ,hrough the mesh - A foward 
[B]AQ l+ < = -/?"- [0,0,..., 0 . A]AQ ,+ I 
-[C, 0 0 , E]AQ' 

and a backward sweep is performed by solving 
[fl]A£ / + 1 = -*» - [ D , 0 o A]AQ , 

-[C, 0 0, E]AQ' +I 

Joi h n e ts C L rr one? n,PUta d ti0nS ’ af ° rWard swee P plus a backward sweep 
counts as one sweep, denoted as PR( 1 ). The solution process is ini- 
.alized by setting Afi to zero. Then, a lower-upper ^ £ 0 - 

0 "e°thl louar^ i$ f ° rmed - ^ 1,16 ^ b^of o^ratilns 
process ‘ '? equatl0n ,s minimized during the repeated sweeping 

to compuS 1 oT^if" VeC ‘° riZed b > se “ in 8 op an inner loop 

given^ + i l col^ ° n 3 dia8 ° nal linC thr ° Ugh the mesh 

Line Relaxation 

The LR process is similar to the PR method, except that more 

r/ifsor? the LHs and a wock ,ridia8onai ° f e ™ a 

s is solved for an entire gnd line at once. The algorithm is 
t'rTh T emed SO i hat either computational direction can be selected 

SJS S5SS"- FOr S01 Ving lines 0f *. a ^-ard 

M, B, C]Ag ' +1 = -R" _ [£>]A 0 ,+ I _ [E]AQ' 
and a backward sweep is performed by solving 

IA, B, C]A0 ,+ l = -R" _ [D]AQl _ [E]aq i+ 1 
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A forward plus a backward sweep counts as two sweeps and I is i de- 
t j l I Dp) xhis process is also initialized by setting Q 

SJSSSto of fho tridiagonal system is form* 

tlnEe ,he .* of operations dunng 

The cweeDino process is recursive and cannot be vectonzea. i 
should noted that the LR and PR schemes each require the same 

number of operations per point per sweep, ca “ s ® . . tridiag- 

i m cnlvp 'l block diaaonal system instead of a block tna g 

° n t ipm it has additional block vector multiply operations for 
onal system, it has a ™n run faster on a vector 

the additional RHS terms, t ne rR scuc , pR rouline 

^■^535:35 

grid, all zones are swept on bef passed to all other zones 

^TX’S-r cooloo, ,» dns fashion, 

information is propagated across zonal boundaries implici y. 

procedure of Seed end Schultz' i. »"«'»“»' C 0 ' 
c JureioVsol.ing ,he linear sys.em of equal, enrol Ore form 

Mx — b = 0 

or. in the left preconditioned form, 

PMx - Pb = 0 

wtoe ; is ie^L 1 . S 

rS “.gt“«r convergence for , he OmBSjroc^- A 
GMRES procedure using k search directions, known as GMRES(k), 
Sis an approximation to the solution vector x given by 

x t = x n + >iUi + • ' ■ + y* u * 

where x„ is an initial guess to the solution x . The v, are orthonormal 
vectors formed from the process 

r„ = PMx o - Pb. fi = Wllbill 

Iterate: for j = 1.2 k: 

h, ! = (PMVj. Vi), i = 1.2 j 

w . + l = PMVj - b \ jV\ - /i2.yu 2 h ji v i 


-i -j 


= llulj + ll 


„. + 1 = Wj+ l/l|Uf/+ 1 II 

of "memory to apply to 

practical to use large values of k. Because oi m.s z 

capability of die OMRES a g^nm a ^ sutipl , b , seltin g 
conunue beginning. „ , loU l of 30 GMRES 

S,;h »n“ S sailed » d. code, then two »»» » 

can be stopped based on directions t0 be used. 

Sold" 5s S' 'SSelSS the 

."=s 

tolerance for the norm of the residual. 


Because the preconditioner must be utilized once for each GM- 
RES search direction, it needs to be relatively cheap. ^ * 
the PR and LR schemes are used as a preconditions for GMRES, 
only two sweeps of the relaxation process are used. 

Computed Results 

Fach of the different methods has been tested for three different 

flow over a NACA 441- ^oil, ^ ^ calculations 

weremn w diiermine an optimal time-step size, an op^Uahre 

for rrSe n « 

iSKSKS Zi 1 d w i- 

with At = 1°‘ 2 , 10 -°. 1-0 and 0 1 10 1 nd200 See 

cases were run using values of 0. 1 , 1,5,1 0 , zu, au, ’ e 

R t^ S-r-ee c JjJg* £ 

S3 S3S5 

sssSs=iss* 

time in milliseconds used to con s . lhe solut i on was 

SsediS 

is!sls=5 

overall performance of the algorithm. 

trrrused^egiid^ins^P-^* 

“EX backward-facing step flow, die only seh=m«d« did no, 

- 1 sr £ ,r Jessie-- - - 



pig- 1 

K te'l .} ity ve 

I facinj 


Grid and inflow veloc- 
vectors for the backward- 
facing step flow. 
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d) Effect of GMRES+PR search directions 



Fig. 2 Convergence for the backward-facing step flow. 


efficient with Ar = 100. Figure 2 shows the convergence for each 
scheme for a different number of sweeps/search directions. The LR 
scheme is best when using only 1 0 sweeps; the efficiency of the PR 
scheme is nearly constant as long as it is using at least 10 sweeps, 
rhe GMRES schemes are most efficient using either 5 or 10 search 
directions. For all of these schemes, it can be seen that when more 
sweeps/search directions are used they converge in fewer iterations, 
but the penalty of extra computing time per iteration causes this ap- 
proach to be less efficient. Figure 3 shows the best convergence plot 
for each method for the backward-facing step flow; a summary of 
the convergence properties is shown in Table 1 . Figure 3 and the last 


column of Table 1 show that GMRES+ILU converges fasterthan all 
other methods; the PR method is not far behind. The GMRES+LR 
scheme is less efficient than the LR scheme alone, even though it 
converges in significantly fewer iterations. The LR scheme shows 
the lowest amplification factor, but at nearly double the cost per 
iteration as the GMRES+ILU method and thus requires more time 
to converge. 

NACA 4412 Airfoil 

The flow over a NACA 4412 airfoil at a Reynolds number of 
1 .5 x 10 and an angle of attack of 13.87 deg was computed using 
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Table 1 Convergence for backward- facing step 


Table 3 Convergence for NACA 4412 grid 2 


Scheme 


0 At A.F. 


Microseconds/ 

point/iteration 


Milliseconds/ 

point 


Microseconds/ Milliseconds/ 
Scheme fi Ar A.F. point/iteration point _ 


LR(10) 

PR(10) 

ILU 

GMRES(5)+LR 

GMRES(10)+PR 

GMRES(10)+ILU 


0.1 

0.5 

0.1 

0.1 

0.1 

0.1 


10 12 

10 12 

1 

100 

10 12 

10 


12 


0.818 

0.912 

0.994 

0.788 

0.850 

0.826 


40.1 
16.7 

9.9 

61.4 

32.9 

21.1 


2.17 

1.69 
23.00 

2.70 
1.97 
1.14 


LR(10) 

PR(20) 

ILU 

GMRES(5)+LR 

GMRES(10)+PR 

GMRES(10)+ILU 



Fig. 3 



50 

50 

10 

5 

50 

5 


10 12 

10 12 

0.1 

10 12 

1.0 

10 12 


0.888 

0.914 

0.990 

0.827 

0.865 

0.780 


25.2 

26.6 

6.7 

40.5 
37.4 

19.6 


3.68 

5.13 

12.00 

3.77 

4.48 

1.45 



CPU seconds 

Summary of convergence for the backward-facing step flow. 


Fig. 4 119 X 31 grid around 

the NACA 4412 airfoil. 


four erids of varying refinement. The dimensions of these grids 
are 119 x 31.237 x 61.473 x 121 and 945 x 241. These wtll 
be referred to as grids 1. 2. 3, and 4, respectively. Gnds 1, 2, and 
3 were generated by removing every other point in each direction 
from the succeeding finer grid. The normal spacing on the finest gnd 
" t o 5 x t 0' -" chord!. Figure 4 shows the airfoil configuration 
with grid 1. This configuration has been studied «pefimenUdy 
by Coles and Wadcock. 1 ’ Previous computations wt th the INS2D 
code 20 showed good agreement with the experimental data. At this 
aSe of attack the airfoil is near stall conditions and has a small 
amount of separation occurring at the trailing edge. 

In the computations using grid 1 , all of the methods mwi 
steo of 1 0 12 . It was found that the LR scheme was most efficient using 
five sweeps. The PR runs converged well for 10, 20, and 40 sweeps, 
with 20 being the best. The GMRES schemes showed a loss » .effi- 
ciency when using more than 1 0 search directions. Figure 5 plotsthe 
^^convergence of each method. This figure shows similar trends 
She backward-facing sic P problem 

all other methods. Table 2 shows that all of the GMRES methods 



had remarkably low amplification factors, below 0.7 for -this » 

All methods had higher cost per point per iteration for ttas ca 
because the smaller dimensions led to shorter vector leng 
cost*per point to converge for the GMRES+ILU was the same as 

the laminar backward-facing step flow. 

The results for grid level 2 are shown in Fig. 6 and in Table 
For this grid, the GMRES+ILU approach outperforms allofthe 
other methods by at least a factor of 2. In all cases ;' h ®^ in 

factor and the cost per point have increased with the increase in 

^BasecTon the results thus far, it is apparent that LR ^and PR are 
too expensive to be used as effective preconditioned for GMRES _It 
was also found that the ILU scheme alone cannot handle finer gnd 

SSS SScncd by Ihc decay of <b ILU » 

0.99 for the previous case. Thus, for the remainder oft test cases in 
this study these three approaches have been elimina 
of Adduce remaining methods for the grid 3 case are shown « 
Fig. 7 and Table 4. All three methods ran best usmg a time step of 

1 0 Again the GMRES+ILU computations outperform the LR and 

scheme would converge for this problem. The results of th 
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Table 4 Convergence for NACA 4412 grid 3 Table 6 Convergence for three-element arifoil 


Scheme 

p 

Ar 

A.F. 

Microseconds/ 

point/iteration 

Milliseconds/ 

point 

Scheme 

0 

Ar 

A.F. 

Microseconds/ 

point/iteration 

Milliseconds/ 

point 

LR(10) 

100 

1.0 

0.953 

23.2 

8.79 

LR(10) 

100 

1.0 a 

0.990 

22.6 

49.7 

PR(40) 

5 

1.0 

0.929 

51.3 

12.56 

PR(20) 

10 

1.0 

0.927 

52.8 

14.7 

GMRES(10)+ILU 

5 

1.0 

0.917 

18.7 

3.91 

GMR(10)+ILU 

10 

I0 12 

0.892 

29.2 

5.37 


a Ar reduced by a factor of 0.5 every 250 iterations. 


Table 5 Convergence for NACA 4412 grid 4 


Microseconds/ Milliseconds/ 

Scheme 0 Ar A.F. point/iteration point 



CPU seconds 

Fig. 7 Convergence for the flow over a NACA 4412 airfoil with grid 3. 



shown in Fig. 8 and Table 5. Examining the trend of the convergence 
rate over all four grid levels shows that all methods suffered an 
increase in the amplification factor with increasing density. The 
effect of this is shown in Fig. 9, which plots the CPU time required 
to reach convergence per point vs the grid level for all of the airfoil 
computations. Even the most robust method, GMRES+ILU, saw an 
order-of-magnitude increase in this CPU cost from the coarsest grid 
to the finest grid. All of the methods required a reduction in the time- 
step size as the grid levels increased. Increasing the 0 parameter 
tended to be a stabilizing influence; the LR and PR methods tended 
to require larger 0 for the finer grids. One apparent contradiction 
to this is the use of 0 = 5 for the PR grid 3 case. In fact, the PR 
convergence for grid 3 did not vary greatly with 0 ; values from 5 
to 100 all converged slowly. The 0 = 5 case was slightly better 
than the 0 = 100 run. One distinct advantage of the GMRES+ILU 
method is that it has optimum convergence with the same value of 
0 for all of the grid levels. 



Fig. 9 CPU time to converge per point vs grid level for the flow over a 
NACA 4412 airfoil. 



Three-Element Airfoil 

The flow over a three-element airfoil was computed at an angle of 
attack of 8.0 deg and a Reynolds number of 9 x 1 0 6 . This geometry 
has recently become a standard test case for multielement airfoil 
flows. This geometry is a McDonnell Douglas airfoil and has been 
tested extensively at the NASA Langley low-turbulence pressure 
tunnel (LTPT). 21 The configuration consists of a leading-edge slat 
deflected 30 deg and a trailing-edge flap deflected 30 deg. This 
geometry has been used as a test case for the INS2D flow solver for 
a number of turbulence model and grid resolution studies. 5 The grid 
and flow conditions of the current problem were some of the most 
difficult cases to converge in the previous study. 

Figure 10 shows the grids used for the three-element airfoil. For 
clarity, only every other grid line in each direction is shown. A total 
of 68,000 grid points and six zones were used; a 121 x 41 C grid 
around the slat (top of Fig. 10); a 321 x 101 C grid around the main 
element (near field shown in middle of Fig. 10); a 141 x 51 C grid 
around the flap (top of Fig. 10); a 41 x 31 H grid in the wake of 
the flap (bottom of Fig. 10); a 131 x 61 H grid extending from the 
main elements’ flap cove to the downstream far field (bottom of 
Fig. 10); and a 141 x 101 embedded grid above the flap, used to 
help resolve the merging wake in this region (middle of Fig. 10). The 
normal wall spacing for all grids is 2 x 10~ 6 chords. The overlaid 
chimera scheme allows individual grids to be generated for each 
airfoil element. When the grid for one element intersects another 
airfoil element, a hole is cut to remove grid points lying inside the 
element. This creates a hole boundary. The fringe-point variables 
on the hole boundaries are updated by interpolating the value of 
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Fig. 11 Summary of convergence for the flow over a three-element 
airfoil. 

the dependent variables from interior points of neighboring grids. 
Similarly, the variables on the outer boundaries of all but the main- 
element grid are updated using interpolation of dependent variables 
from neighboring grids. 

The LR and PR computations for this configuration required a re- 
duction inthe time step to At = 1. 0 to remain stable; GMRES-bILU 
ran with a time step of 10 12 . The LR computations were extremely 
slow to converge and were found to benefit from a further decrease 
in the time step during the calculations, and so the LR time step was 
decreased by a factor of 2 every 250 iterations. The PR computations 
did not suffer from the same problems. It was found that the PR com- 
putations converged best using 20 sweeps. The GMRES+ ILU com- 
putations performed best with 10 search directions. The best perfor- 
mance of each method is plotted in Fig. 1 1 and shown in Table 6. It 
can be seen that the GMRES-f ILU method outperformed the other 
two; it converges about nine times faster than the LR scheme and 
about three times faster than the PR computations. 

Conclusion 

Several implicit schemes have been implemented into the INS2D 
code and tested for the flow over a backward-facing step, a NACA 
airfoil, and a three-element airfoil. The results indicate that when 
running on a single processor the GMRES method preconditioned 
with ILU factorization outperforms line relaxation and point relax- 
ation and that the latter two methods are not efficient precondition- 
ers for GMRES. The GMRES+ILU method has provided between 
a factor of 2 and 9 improvement in CPU costs over previously pub- 
lished results of the INS2D code. In addition, the point-relaxation 
scheme performed remarkably well for the three-element airfoil 
problem. Extensive tests have indicated that GMRES is most ef- 
ficient when using 10 search directions; if more search directions 
are used, the computation will converge in fewer iterations, but at 
a greater cost. The GMRES -HlLU algorithm also remained more 
stable than the LR or PR methods; it ran with larger time steps and 
did not require an increase in the value of the artificial compress- 
ibility constant for the finger grids used in the NACA 4412 airfoil 
calculations. 

The GMRES-bILU algorithm has been shown to work quite 
well for a two-dimensional flow solver. The success with this al- 
gorithm has also been observed recently by others. 6,9-11 In partic- 
ular, Venkatakrishnan and Mavriplis 11 performed a similar study 
of implicit solvers for an unstructured flow code; they found their 
GMRES+ILU worked best. As it was utilized here, the GMRES 
algorithm would be very memory intensive for a three-dimensional 
flow solver. Future work will concentrate on utilizing a matrix-free 


method of the GMRES algorithm. The most important part of this 
work will be determining an effective preconditioner that will not 
require significant amounts of memory. 
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